BUILD: Add configurable CUDA architecture targeting with meson built-in options #719

michal-shalev · 2025-08-24T08:23:14Z

What?

Add support for configurable CUDA architecture targeting using Meson's built-in cuda_args and cuda_link_args options.

Why?

Replace the hardcoded CUDA architecture flags with defaults that users can override through Meson's built-in CUDA argument handling.

How?

Leverage Meson's built-in cuda_args/cuda_link_args compiler options
Apply the defaults (compute_80, compute_90) only when the user has not specified custom values
Update the README with a CUDA architecture configuration section
Simplify the code by removing the custom flag handling logic
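
The steps above could look roughly like this in meson.build. It is a minimal sketch, not the code from this PR: the project name and the `default_cuda_args` variable are illustrative, and reusing one list for both the compile and link defaults is an assumption.

```meson
# Sketch only: names are illustrative, not taken from the actual meson.build.
project('cuda_arch_defaults_sketch', 'cpp')

if add_languages('cuda', required: false, native: false)
    # Defaults described in this PR: Ampere (compute_80) and Hopper (compute_90).
    default_cuda_args = [
        '-gencode=arch=compute_80,code=sm_80',
        '-gencode=arch=compute_90,code=sm_90',
    ]

    # cuda_args / cuda_link_args are Meson built-in per-language options.
    # get_option() returns them as arrays; they are empty unless the user
    # supplied values, for example with -Dcuda_args=...
    if get_option('cuda_args').length() == 0
        add_project_arguments(default_cuda_args, language: 'cuda')
    endif
    if get_option('cuda_link_args').length() == 0
        add_project_link_arguments(default_cuda_args, language: 'cuda')
    endif
endif
```

The calls to `add_project_arguments()` and `add_project_link_arguments()` have to come before any build target is declared, so a block like this would sit near the top of the build file.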

Usage

# Uses defaults (Ampere & Hopper: compute_80, compute_90)
meson setup build

# Target specific architecture
meson setup build \
    -Dcuda_args="-gencode=arch=compute_75,code=sm_75" \
    -Dcuda_link_args="-gencode=arch=compute_75,code=sm_75"

# Target multiple architectures
meson setup build \
    -Dcuda_args="-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80" \
    -Dcuda_link_args="-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80"

See README.md for complete documentation and additional examples.

copy-pr-bot · 2025-08-24T08:23:17Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2025-08-24T08:23:22Z

👋 Hi michal-shalev! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

yosefe · 2025-08-24T13:03:48Z

meson.build

-    nvcc_flags += ['-gencode', 'arch=compute_80,code=sm_80']
-    nvcc_flags += ['-gencode', 'arch=compute_90,code=sm_90']
+    nvcc_flags_link = []
+    if get_option('nvcc_gencode') != ''


Why is this check needed? Wouldn't the for loop below be empty anyway if the option is not set?

Because with a string option, ''.split(' ') returns [''], so the loop runs once and adds an empty flag. I changed nvcc_gencode to an array and removed this check and split to avoid empty args.
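
The pitfall described in this reply can be reproduced in a few lines of Meson. A small sketch with illustrative project and variable names:

```meson
# Sketch only: demonstrates why splitting an empty string option needs a guard.
project('split_sketch')

raw = ''                  # what a string option holds when it is left unset
parts = raw.split(' ')    # yields one empty string, not an empty list
assert(parts.length() == 1 and parts[0] == '', 'expected a single empty element')

# Without the check on each element, '' would be appended as a compiler flag.
flags = []
foreach part : parts
    if part != ''
        flags += [part]
    endif
endforeach
assert(flags.length() == 0, 'no flags expected for an empty option')
```

An array-typed option sidesteps the problem, because `get_option()` then returns a list directly and no `split()` is needed.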

yosefe · 2025-08-24T13:12:31Z

meson_options.txt

 option('cudapath_inc', type: 'string', value: '', description: 'Include path for CUDA')
 option('cudapath_lib', type: 'string', value: '', description: 'Library path for CUDA')
 option('cudapath_stub', type: 'string', value: '', description: 'Extra Stub path for CUDA')
+option('nvcc_gencode', type: 'string', value: '-gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_90,code=sm_90',


It seems we can provide Meson CUDA build flags like this:

meson setup -Dcuda_args="-gencode=arch=compute_90,code=sm_90" \
    -Dcuda_link_args="-gencode=arch=compute_90,code=sm_90"

Can you please check it? If it works, maybe there is no need to add a specific option.

I tested -Dcuda_args and -Dcuda_link_args and it's working.
I've updated the PR to use meson's built-in cuda_args and cuda_link_args instead of adding a custom option.

Signed-off-by: Michal Shalev <[email protected]>

tvegas1 · 2025-10-06T10:51:39Z

meson.build

+        '-gencode=arch=compute_80,code=sm_80',
+        '-gencode=arch=compute_90,code=sm_90'
+    ]
+    default_cuda_link_args = [


remove and use default_cuda_args instead

Notice that previously we had nvcc_flags and nvcc_flags_link. I did not want to change that logic, only to use the built-in Meson option.

I mean that default_cuda_link_args and default_cuda_args are the same, so we can remove the first one and use default_cuda_args at line 114 too.

nvcc_flags and nvcc_flags_link were the same, so I can set default_cuda_link_args to default_cuda_args.

I am suggesting to remove default_cuda_link_args = {} at line 100, and replace all other occurrences of default_cuda_link_args with default_cuda_args.

tvegas1 · 2025-10-06T10:54:06Z

meson.build

+        cuda_args = default_cuda_args
+        add_project_arguments(cuda_args, language: 'cuda')
+    else
+        cuda_args = []


I guess we need to call add_project_arguments with cuda_args_option?

No, meson has built-in cuda_args and cuda_link_args options, but if it's not clear I can add another comment here

Makes sense, but why cuda_args = []? I would have assumed we need to pass it to the cuda.compiles() call below. Or is it implicitly passed already? In that case we could remove the args: cuda_args.
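
For context on this question: the Meson reference manual says that arguments registered with `add_global_arguments()` or `add_project_arguments()` are used in all compiler invocations except compile tests. Under that reading, defaults added through `add_project_arguments()` would not reach `compiles()` on their own and have to be passed with `args:`. A hedged sketch with illustrative names and a made-up test program, not the check used in this PR:

```meson
# Sketch only: passing architecture flags explicitly to a compile test.
project('cuda_check_sketch', 'cpp')

if add_languages('cuda', required: false, native: false)
    cuda = meson.get_compiler('cuda')
    default_cuda_args = ['-gencode=arch=compute_80,code=sm_80']

    # Project arguments are not applied to compile tests, so the flags
    # are handed to the check directly.
    kernel_ok = cuda.compiles(
        '__global__ void noop_kernel() {}',
        args: default_cuda_args,
        name: 'trivial CUDA kernel',
    )
    message('trivial CUDA kernel compiles: @0@'.format(kernel_ok))
endif
```

Whether values supplied through -Dcuda_args reach the compile test without `args:` is not demonstrated here and would need to be checked against the Meson version in use.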

meson.build

michal-shalev · 2025-10-06T17:37:55Z

/build

ovidiusm · 2025-10-06T20:21:31Z

README.md

+- `compute_80,code=sm_80`: NVIDIA Ampere (A100, RTX 30xx)
+- `compute_86,code=sm_86`: NVIDIA Ampere (RTX 30xx consumer)
+- `compute_89,code=sm_89`: NVIDIA Ada Lovelace (RTX 40xx)
+- `compute_90,code=sm_90`: NVIDIA Hopper (H100, H800, H200)


Question 1: do we support Blackwell? Why is it not listed?
Question 2: what should be the value when packaging the wheel, since that needs to cover all platforms where the user may do pip install nixl

I didn't test it on a cluster with Blackwell, and I don't think we'll have time to test it for this release.

IMO it should stay the same, there are still defaults

michal-shalev self-assigned this Aug 24, 2025

michal-shalev requested a review from a team as a code owner August 24, 2025 08:23

pull-request-size bot added the size/S label Aug 24, 2025

github-actions bot added the external-contribution label Aug 24, 2025

michal-shalev requested review from brminich, iyastreb, ovidiusm, tvegas1 and yosefe August 24, 2025 08:24

yosefe reviewed Aug 24, 2025

michal-shalev force-pushed the add-nvcc-gencode-opt-to-meson branch 3 times, most recently from 7e58efd to 01da329 Compare August 24, 2025 14:25

michal-shalev added 2 commits August 26, 2025 01:25

BUILD: add nvcc_gencode option for CUDA architectures

0bc6672

make nvcc_gencode an array of flags

07321e7

michal-shalev force-pushed the add-nvcc-gencode-opt-to-meson branch from 0bb541f to 07321e7 Compare August 25, 2025 22:25

ovidiusm previously approved these changes Oct 6, 2025

Fix meson build default logic and update sm90 in README

807c8e8

Signed-off-by: Michal Shalev <[email protected]>

michal-shalev dismissed ovidiusm’s stale review via 807c8e8 October 6, 2025 10:44

michal-shalev requested a review from ovidiusm October 6, 2025 10:46

tvegas1 reviewed Oct 6, 2025

michal-shalev requested a review from tvegas1 October 6, 2025 17:36

ovidiusm reviewed Oct 6, 2025

michal-shalev force-pushed the add-nvcc-gencode-opt-to-meson branch from 3dda4df to 807c8e8 Compare October 11, 2025 18:38
